Using Hashing to Solve the Dictionary Problem (In External Memory)
We consider the dictionary problem in external memory and improve the update
time of the well-known buffer tree by roughly a logarithmic factor. For any
λ ≥ max{lg lg n, log_{M/B}(n/B)}, we can support updates in time
O(λ/B) and queries in sublogarithmic time, O(log_λ n). We also
present a lower bound in the cell-probe model showing that our data structure
is optimal.
In the RAM, hash tables have been used to solve the dictionary problem faster
than binary search for more than half a century. By contrast, our data
structure is the first to beat the comparison barrier in external memory. Ours
is also the first data structure to depart convincingly from the indivisibility
paradigm.
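The update/query tradeoff above rests on batching updates rather than applying them one at a time. As a loose illustration of that general buffering idea (a toy sketch only, not the paper's hashing-based structure; `BufferedDict` and its fields are invented names), consider:

```python
# Toy sketch of write-buffering, the general idea behind buffer trees:
# updates land in a small in-memory buffer and are flushed to "external"
# storage in one batch, amortizing the transfer cost over many updates.
# This is an illustration, NOT the paper's data structure.
class BufferedDict:
    def __init__(self, buffer_size=4):
        self.buffer_size = buffer_size   # plays the role of the block size B
        self.buffer = {}                 # recent, unflushed updates
        self.external = {}               # stand-in for on-disk storage

    def insert(self, key, value):
        self.buffer[key] = value
        if len(self.buffer) >= self.buffer_size:
            self.flush()

    def flush(self):
        self.external.update(self.buffer)  # one batched "I/O"
        self.buffer.clear()

    def query(self, key):
        if key in self.buffer:             # recent updates shadow old ones
            return self.buffer[key]
        return self.external.get(key)
```

Queries must consult the buffer first, which is why buffered schemes pay for faster updates with extra work (or, in the paper's setting, cleverer hashing) on the query path.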
Lower bound techniques for data structures
Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. Includes bibliographical references (p. 135-143).
We describe new techniques for proving lower bounds on data-structure problems, with the following broad consequences:
* the first Ω(lg n) lower bound for any dynamic problem, improving on a bound that had been standing since 1989;
* for static data structures, the first separation between linear and polynomial space. Specifically, for some problems that have constant query time when polynomial space is allowed, we can show Ω(lg n / lg lg n) bounds when the space is O(n · polylog n).
Using these techniques, we analyze a variety of central data-structure problems, and obtain improved lower bounds for the following:
* the partial-sums problem (a fundamental application of augmented binary search trees);
* the predecessor problem (which is equivalent to IP lookup in Internet routers);
* dynamic trees and dynamic connectivity;
* orthogonal range stabbing;
* orthogonal range counting, and orthogonal range reporting;
* the partial match problem (searching with wild-cards);
* (1 + ε)-approximate near neighbor on the hypercube;
* approximate nearest neighbor in the ℓ_∞ metric.
Our new techniques lead to surprisingly non-technical proofs. For several problems, we obtain simpler proofs for bounds that were already known.
by Mihai Pătrașcu. Ph.D.
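The partial-sums problem mentioned above is the classic target of the Ω(lg n) dynamic lower bound; on the upper-bound side, a Fenwick (binary indexed) tree achieves O(lg n) per operation, so the two match. A minimal sketch of that standard structure:

```python
# Minimal Fenwick (binary indexed) tree for the partial-sums problem:
# update(i, delta) and prefix_sum(i) both take O(lg n) time, matching
# the Omega(lg n) dynamic lower bound discussed in the abstract.
class Fenwick:
    def __init__(self, n):
        self.n = n
        self.tree = [0] * (n + 1)       # 1-indexed internal array

    def update(self, i, delta):
        # add delta to element i (1-indexed)
        while i <= self.n:
            self.tree[i] += delta
            i += i & (-i)               # jump to the next responsible node

    def prefix_sum(self, i):
        # return the sum of elements 1..i
        s = 0
        while i > 0:
            s += self.tree[i]
            i -= i & (-i)               # strip the lowest set bit
        return s
```

Both loops touch one node per set bit of the index, hence the logarithmic bound.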
One Table to Count Them All: Parallel Frequency Estimation on Single-Board Computers
Sketches are probabilistic data structures that can provide approximate
results within mathematically proven error bounds while using orders of
magnitude less memory than traditional approaches. They are tailored for
streaming data analysis on architectures even with limited memory such as
single-board computers that are widely exploited for IoT and edge computing.
Since these devices offer multiple cores, with efficient parallel sketching
schemes, they are able to manage high volumes of data streams. However, since
their caches are relatively small, a careful parallelization is required. In
this work, we focus on the frequency estimation problem and evaluate the
performance of a high-end server, a 4-core Raspberry Pi and an 8-core Odroid.
As a sketch, we employed the widely used Count-Min Sketch. To hash the stream
in parallel and in a cache-friendly way, we applied a novel tabulation approach
and rearranged the auxiliary tables into a single one. To parallelize the
process efficiently, we modified the workflow and applied a form of
buffering between hash computations and sketch updates. Today, many
single-board computers have heterogeneous processors in which slow and fast
cores are equipped together. To utilize all these cores to their full
potential, we proposed a dynamic load-balancing mechanism which significantly
increased the performance of frequency estimation.
Comment: 12 pages, 4 figures, 3 algorithms, 1 table, submitted to EuroPar'1
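For reference, a plain sequential Count-Min Sketch looks as follows. This is the textbook baseline, not the parallel, tabulation-hashed variant the abstract describes (the merged tables, buffering, and load balancing are omitted); the hash family and parameter names here are illustrative choices:

```python
import random

# Textbook sequential Count-Min Sketch: depth independent hash rows over
# a width-w table. add() increments one counter per row; estimate()
# takes the minimum over rows, which never underestimates the true count.
class CountMinSketch:
    def __init__(self, depth=4, width=256, seed=42):
        rng = random.Random(seed)
        self.depth, self.width = depth, width
        self.table = [[0] * width for _ in range(depth)]
        self.p = (1 << 31) - 1          # Mersenne prime for (a*x + b) mod p
        self.hashes = [(rng.randrange(1, self.p), rng.randrange(self.p))
                       for _ in range(depth)]

    def _h(self, row, x):
        a, b = self.hashes[row]
        return ((a * x + b) % self.p) % self.width

    def add(self, x):
        for r in range(self.depth):
            self.table[r][self._h(r, x)] += 1

    def estimate(self, x):
        return min(self.table[r][self._h(r, x)] for r in range(self.depth))
```

The paper's contribution sits inside `_h` and `add`: computing the row hashes via a single merged tabulation table and buffering updates so that parallel cores stay cache-friendly.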
Picture-Hanging Puzzles
We show how to hang a picture by wrapping rope around n nails, making a
polynomial number of twists, such that the picture falls whenever any k out of
the n nails get removed, and the picture remains hanging when fewer than k
nails get removed. This construction makes for some fun mathematical magic
performances. More generally, we show that the Boolean functions describing
when the picture falls, in terms of which nails get removed, are exactly the
monotone Boolean functions. This construction requires an exponential
number of twists in the worst case, but exponential complexity is almost always
necessary for general functions.
Comment: 18 pages, 8 figures, 11 puzzles. Journal version of FUN 2012 paper
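The standard way to model such puzzles is as words in a free group: each wrap of the rope around a nail is a generator or its inverse, removing a nail deletes its letters, and the picture falls exactly when the remaining word freely reduces to the identity. A toy checker, shown on the classic two-nail commutator x1 x2 x1⁻¹ x2⁻¹ (the function names here are my own):

```python
# Model a rope wrapping as a word in a free group: each letter is
# (nail, +1) or (nail, -1) for opposite wrap directions. Removing a
# nail deletes its letters; the picture falls iff the remaining word
# freely reduces to the empty word.
def free_reduce(word):
    stack = []
    for letter in word:
        if stack and stack[-1] == (letter[0], -letter[1]):
            stack.pop()                 # cancel adjacent x x^{-1}
        else:
            stack.append(letter)
    return stack

def falls_when_removed(word, nail):
    remaining = [l for l in word if l[0] != nail]
    return free_reduce(remaining) == []

# Commutator x1 x2 x1^{-1} x2^{-1}: hangs on both nails,
# but falls if either single nail is removed.
commutator = [(1, 1), (2, 1), (1, -1), (2, -1)]
```

The paper's constructions generalize this: realizing an arbitrary monotone function of the removed nails amounts to building longer words with the corresponding cancellation pattern, which is where the exponential blow-up can appear.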
The geometry of binary search trees
We present a novel connection between binary search trees (BSTs) and points in the plane satisfying a simple property. Using this correspondence, we achieve the following results:
1. A surprisingly clean restatement in geometric terms of many results and conjectures relating to BSTs and dynamic optimality.
2. A new lower bound for searching in the BST model, which subsumes the previous two known bounds of Wilber [FOCS’86].
3. The first proposal for dynamic optimality not based on splay trees. A natural greedy but offline algorithm was presented by Lucas [1988], and independently by Munro [2000], and was conjectured to be an (additive) approximation of the best binary search tree. We show that there exists an equal-cost online algorithm, transforming the conjecture of Lucas and Munro into the conjecture that the greedy algorithm is dynamically optimal.
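The "simple property" in question is that the point set be arborally satisfied: any two points not sharing an x- or y-coordinate must span an axis-aligned rectangle containing a third point of the set. A small checker for that condition (a sketch under that reading of the property; the function name is mine):

```python
from itertools import combinations

# Check whether a planar point set is "arborally satisfied": for every
# pair of points with distinct x- and y-coordinates, the axis-aligned
# rectangle they span must contain some third point of the set
# (on its boundary or in its interior).
def is_arborally_satisfied(points):
    pts = set(points)
    for (x1, y1), (x2, y2) in combinations(pts, 2):
        if x1 == x2 or y1 == y2:
            continue                    # aligned pairs impose no constraint
        lox, hix = min(x1, x2), max(x1, x2)
        loy, hiy = min(y1, y2), max(y1, y2)
        if not any(lox <= x <= hix and loy <= y <= hiy
                   and (x, y) not in {(x1, y1), (x2, y2)}
                   for (x, y) in pts):
            return False
    return True
```

In the correspondence, an access sequence becomes a point per (key, time), a BST execution becomes a superset of touched points, and minimizing BST cost becomes finding the smallest arborally satisfied superset.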